DETAILED ACTION 

Continued Examination Under 37 CFR 1.114 

1. A request for continued examination under 37 CFR 1.114, including the fee set forth in 
37 CFR 1.17(e), was filed in this application after final rejection. Since this application is 
eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) 
has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 
37 CFR 1.114. Applicant's submission filed on 2/13/08 has been entered. 



Response to Arguments 

2. Applicant's arguments filed on 2/13/08 have been fully considered but they are not 
persuasive. Applicant argues that the prior art reference Chen in view of Brockett does not teach 
the feature of processing a sentence of Chinese characters into constituent words. The examiner 
respectfully disagrees and points out that Chen does teach segmenting a sentence of Chinese 
characters into constituent Chinese words having one or more Chinese characters (col. 3, lines 
18-32). With regard to Brockett, although the examples given are of Japanese text, the reference 
states in col. 1, lines 40-48 that it teaches processing non-segmented text like Japanese or Chinese. 

As per obtaining probability information based on at least one context feature adjacent 
the overlapping ambiguity string and at least part of the recognized OAS for each of the FMM 
and BMM, col. 6, lines 6-42 of Brockett necessarily discloses the above-cited limitation within the 
process wherein the system checks the context feature adjacent to the OAS to identify the 
ABCD string's substrings (i.e., AB, BC, ABC). 

As per the rest of the claims, applicant has no further arguments besides the ones 
mentioned above. Therefore, all the combinations of prior art references mentioned above are 
valid, and all other dependent claims are rejected for the same reasons as set forth above. 

Information Disclosure Statement 

3. The information disclosure statement filed 08/17/2007 fails to comply with 37 CFR 
1.98(a)(3) because it does not include a concise explanation of the relevance, as it is presently 
understood by the individual designated in 37 CFR 1.56(c) most knowledgeable about the 
content of the information, of each patent listed that is not in the English language. Although the IDS 
includes a concise statement of relevancy with the current application, applicant is required to 
submit the complete English translation in order for the references to be considered. The IDS has been placed in 
the application file, but the information referred to therein has not been considered. 

Claim Rejections - 35 USC §103 

4. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in 
section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are 
such that the subject matter as a whole would have been obvious at the time the invention was made to a person 
having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the 
manner in which the invention was made. 

Claims 1-4, 6-7, 14-21, 23, 25-26, and 28 are rejected under 35 U.S.C. 103(a) as 
being unpatentable over Chen et al. (U.S. 5,806,021, issued on Sept. 8, 1998) (hereinafter: Chen) 
in view of Brockett et al. (U.S. 6,968,308, filed Nov. 1, 2000 and issued on Nov. 22, 2005) 
(hereinafter: Brockett). 

As per claims 1, 14, and 25, Chen teaches segmenting a sentence of Chinese characters 
into constituent Chinese words having one or more Chinese characters (col. 3, lines 18-32); 

performing a Forward Maximum Matching (FMM) segmentation (col. 3, lines 37-65) and 
a Backward Maximum Matching (BMM) segmentation (col. 3, line 66 - col. 4, line 24); 

generating an n-gram model (col. 4, lines 45-47), and 

selecting one of the two segmentations as a function of probability information for the 
two segmentations (col. 4, lines 25-26). 
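
For illustration only, the two maximum-matching passes named above can be sketched as follows. This is a minimal sketch and is not drawn from Chen or from the claims: the function names `fmm` and `bmm`, the `max_len` parameter, and the toy lexicon are assumptions, and Latin letters stand in for Chinese characters.

```python
def fmm(sentence, lexicon, max_len=4):
    """Forward Maximum Matching: scan left to right, taking the longest lexicon word."""
    words, i = [], 0
    while i < len(sentence):
        for size in range(min(max_len, len(sentence) - i), 0, -1):
            piece = sentence[i:i + size]
            # A single character is always accepted so the scan cannot stall.
            if size == 1 or piece in lexicon:
                words.append(piece)
                i += size
                break
    return words


def bmm(sentence, lexicon, max_len=4):
    """Backward Maximum Matching: scan right to left, taking the longest lexicon word."""
    words, j = [], len(sentence)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = sentence[j - size:j]
            if size == 1 or piece in lexicon:
                words.insert(0, piece)
                j -= size
                break
    return words


# Hypothetical lexicon; Latin letters stand in for Chinese characters.
LEXICON = {"AB", "BC", "CD"}
print(fmm("ABC", LEXICON))  # ['AB', 'C']
print(bmm("ABC", LEXICON))  # ['A', 'BC']
```

In this toy case the two passes disagree on the string ABC, which is the kind of situation in which one of the two segmentations must then be selected.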

Chen does not explicitly teach recognizing an overlapping ambiguity string in the 
segmented sentence, wherein the overlapping ambiguity string comprises at least three Chinese 
characters having at least two possible segmentations, obtaining probability information based on 
at least one context feature adjacent the overlapping ambiguity string and at least part of the 
recognized OAS for each of the FMM and BMM; outputting an indication for selecting one of 
the at least two possible segmentations as a function of the obtained probability information; and 
replacing the overlapping ambiguity string with tokens. 

Brockett in the same field of endeavor teaches recognizing the overlapping ambiguity 
string in the segmented sentence, wherein the overlapping ambiguity string comprises at least 
three Chinese characters having at least two possible segmentations (col. 1, lines 40-48, wherein 
the processed text is non-segmented text like Japanese or Chinese, col. 2, lines 16-17 and col. 10, 
lines 41-49, wherein the recognized overlapping ambiguity string comprises at least three 
Chinese characters having at least two possible segmentations), obtaining probability 
information based on at least one context feature adjacent the overlapping ambiguity string and 
at least part of the recognized OAS for each of the FMM and BMM (necessarily disclosed within 
the process of col. 6, lines 6-42, wherein the system checks the context feature adjacent to the 
OAS to identify the ABCD string's substrings, i.e., AB, BC, ABC); outputting an indication for 
selecting one of the two segmentations as a function of the obtained probability information (col. 
11, lines 5-19, wherein the most probable segmentation of the input text is selected), and 
replacing the overlapping ambiguity string with tokens (inherent in selecting the most probable 
segmentation for the input string, col. 11, lines 5-19). 

Therefore, it would have been obvious to a person of ordinary skill in the art at the time 
the invention was made to apply the features of the overlapping ambiguity string recognizer of 
Brockett to the text segmentation system of Chen, to resolve the overlapping ambiguity of 
unsegmented input strings, because Brockett suggests that this would better identify the right 
segment among the competing segments (col. 1, lines 55-63). 

As per claims 2-4, 23, and 26, Chen in view of Brockett teaches obtaining the probability 
information from a language model (lexicon, col. 2, line 41) based on the at least one context 
feature and a left or right portion of the overlapping ambiguity string (necessarily disclosed for 
determining word boundaries, col. 2, lines 39-44), wherein the language model comprises a 
trigram model (col. 2, lines 45-49), wherein outputting an indication for selecting one of the at 
least two possible segmentations comprises classifying the probability information (col. 3, lines 
29-32, wherein the probability information (likelihood) of both segmentations is calculated and 
classified to select the segmentation with higher likelihood). 

As per claims 6-7 and 28, Chen teaches performing a Forward Maximum Matching 
(FMM) segmentation for recognizing a segmentation O_f (col. 3, lines 37-65) and a Backward 
Maximum Matching (BMM) segmentation for recognizing a segmentation O_b of the input 
sentence (col. 3, line 66 - col. 4, line 24). 

Chen does not explicitly teach recognizing an overlapping ambiguity string in the input 
sentence as a function of the two segmentations. 

Brockett in the same field of endeavor teaches recognizing the overlapping ambiguity 
string in the input sentence as a function of the two segmentations (col. 2, lines 16-17). 

Therefore, it would have been obvious to a person of ordinary skill in the art at the time 
the invention was made to combine the overlapping ambiguity string recognizer of Brockett with 
the text segmentation system of Chen, because Brockett suggests that this would better identify 
the right segment among the competing segments (col. 1, lines 55-63). 
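
For illustration only, recognizing an ambiguous string "as a function of the two segmentations" can be sketched by comparing the word boundaries that the two segmentations produce. This is a minimal sketch under stated assumptions, not the method of Brockett or of the claims: the helper names `boundaries` and `disagreement_spans` are hypothetical, and the sketch reports every span on which the two segmentations disagree without checking the spans against a lexicon.

```python
def boundaries(words):
    """Return the character offsets at which a segmentation places a word boundary."""
    cuts, pos = {0}, 0
    for w in words:
        pos += len(w)
        cuts.add(pos)
    return cuts


def disagreement_spans(seg_f, seg_b):
    """Return substrings, delimited by shared boundaries, where the segmentations differ."""
    sentence = "".join(seg_f)
    if sentence != "".join(seg_b):
        raise ValueError("segmentations must cover the same sentence")
    cuts_f, cuts_b = boundaries(seg_f), boundaries(seg_b)
    shared = sorted(cuts_f & cuts_b)
    spans = []
    for start, end in zip(shared, shared[1:]):
        inner_f = {c for c in cuts_f if start < c < end}
        inner_b = {c for c in cuts_b if start < c < end}
        # Different internal cuts between two shared boundaries mark a disagreement.
        if inner_f != inner_b:
            spans.append(sentence[start:end])
    return spans


# Hypothetical segmentations; Latin letters stand in for Chinese characters.
print(disagreement_spans(["X", "AB", "C", "Y"], ["X", "A", "BC", "Y"]))  # ['ABC']
```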

As per claim 15, Chen teaches determining a probability associated with each of the 
FMM segmentation of the overlapping ambiguity string and the BMM segmentation of the 
overlapping ambiguity string based on higher probability (col. 3, lines 18-32, wherein the 
segmentation with higher likelihood is chosen). 

As per claims 16-18, Chen teaches an N-gram model (col. 4, lines 45-47), and 
probability information about a first and last word of the overlapping ambiguity string (col. 5, 
lines 1-5, wherein the probability of each part of the phrase (word) resulting from a segmentation is 
compared separately). 

As per claims 19-21, Chen teaches an N-gram model (col. 4, lines 45-47) that uses trigram 
probability information about a string of words comprising a first word of the overlapping 
ambiguity string and two context words to the left of the first word, and a last word of the 
overlapping ambiguity string and two context words to the right of the last word (inherently 
disclosed in the process of determining likelihood scores using n-gram models (tri-gram model), 
col. 5, lines 45-47). 

Claims 5, 8-12, 22, 24, 27, and 29-31 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Chen in view of Brockett, as applied to claims 4, 15, and 23, and further in 
view of Pedersen ("A Simple Approach to Building Ensembles of Naive Bayesian Classifiers for 
Word Sense Disambiguation", in Proceedings of the First Annual Meeting of the North 
American Chapter of the Association for Computational Linguistics, pp. 63-69, April 29 - May 
4, 2000). 

As per claims 5, 22, and 24, Chen in view of Brockett teaches all the limitations of claims 
4, 15, and 23, upon which claims 5, 22, and 24 depend. 

Chen and Brockett do not explicitly teach using an ensemble of Naive Bayesian 
Classifiers. 

Pedersen in the same field of endeavor teaches using an ensemble of Naive Bayesian 
Classifiers (Abstract). 

Therefore, it would have been obvious to a person of ordinary skill in the art at the time 
the invention was made to combine Pedersen's Naive Bayesian Classifier with the automatic 
text segmenter of Chen, because Pedersen suggests that this would provide more accurate 
disambiguation systems (Abstract). 

As per claims 8-12, Chen in view of Brockett teaches selecting one of the two segmentations (col. 4, 
lines 25-26), classifying the probability information of O_f and O_b (col. 3, lines 29-32, wherein 
the probability information (likelihood) of both segmentations is calculated and classified to 
select the segmentation with higher likelihood), and determining which one of the said 
probabilities is higher (col. 4, lines 25-26). 

Chen and Brockett do not explicitly teach selecting one of the at least two segmentations as a 
function of a set of context features, words around the overlapping ambiguity string, associated 
with the overlapping ambiguity string, classifying the probability information of the context 
features surrounding the overlapping ambiguity string, and determining which one of the said 
probabilities is higher, as a function of the set of context features. 

Pedersen in the same field of endeavor teaches the Naive Bayesian Classifier for word 
sense disambiguation based on windows of context (pages 63-64). 

Therefore, it would have been obvious to a person of ordinary skill in the art at the time 
the invention was made to use the Naive Bayesian Classifier of Pedersen in combination with 
the text segmenting system of Chen, to use the probability information of the context features to 
select one of the two segmentations. Pedersen suggests that this would provide more accurate 
disambiguation systems (Abstract). 
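
For illustration only, a single Naive Bayesian classifier that selects between two segmentation labels from a window of context words can be sketched as follows. This is a minimal sketch under stated assumptions and does not reproduce the ensemble described in the cited Pedersen reference: the class name, the add-one smoothing, the labels "FMM" and "BMM", and the toy training data are all hypothetical.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes over context-word features with additive smoothing."""

    def __init__(self, smoothing=1.0):
        self.smoothing = smoothing
        self.label_counts = Counter()
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()

    def train(self, examples):
        for features, label in examples:
            self.label_counts[label] += 1
            for f in features:
                self.feature_counts[label][f] += 1
                self.vocab.add(f)

    def log_score(self, features, label):
        # log P(label) + sum of log P(feature | label), features assumed independent.
        total = sum(self.label_counts.values())
        score = math.log(self.label_counts[label] / total)
        denom = sum(self.feature_counts[label].values()) + self.smoothing * len(self.vocab)
        for f in features:
            score += math.log((self.feature_counts[label][f] + self.smoothing) / denom)
        return score

    def classify(self, features):
        return max(self.label_counts, key=lambda label: self.log_score(features, label))


# Hypothetical training data: context words around an ambiguous string, with the
# segmentation ("FMM" or "BMM") assumed to be correct in that context.
TRAIN = [(["w1", "w2"], "FMM"), (["w1", "w3"], "FMM"), (["w4", "w5"], "BMM")]
clf = NaiveBayes()
clf.train(TRAIN)
print(clf.classify(["w1"]))        # FMM
print(clf.classify(["w4", "w5"]))  # BMM
```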

As per claims 27 and 29, Chen in view of Brockett teaches all the limitations of claims 
25 and 28, upon which claims 27 and 29 depend. 

Chen and Brockett do not explicitly teach generating an ensemble of classifiers as a 
function of an n-gram model. 

Pedersen in the same field of endeavor teaches generating an ensemble of classifiers as a 
function of an n-gram model (Abstract, and page 64, col. 2, lines 15-19). 

Therefore, it would have been obvious to a person of ordinary skill in the art at the time 
the invention was made to combine Pedersen's classifiers with the combined system of Chen 
and Brockett, because Pedersen suggests that this would provide more accurate disambiguation 
systems (Abstract). 

As per claim 30, Chen, Brockett, and Pedersen teach all the limitations of claim 29, upon 
which claim 30 depends. Chen in view of Brockett, furthermore, teaches approximating 
probabilities of the FMM and BMM segmentations of each overlapping ambiguity string as 
being equal to the product of individual unigram probabilities of individual words in the FMM 
and BMM segmentations respectively, of the overlapping ambiguity string (col. 3, line 37 - col. 
4, line 26, wherein the probabilities of the FMM and BMM segmentations of each overlapping 
ambiguity are approximated and compared to choose the one with the highest score). 
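
For illustration only, approximating the probability of a segmentation as the product of the unigram probabilities of its words, and then choosing the higher-scoring segmentation, can be sketched as follows. This is a minimal sketch under stated assumptions, not taken from the cited references: the probability table, the `floor` value for unseen words, and the function names are hypothetical, and logarithms are summed in place of multiplying raw probabilities.

```python
import math

# Hypothetical unigram probabilities; Latin letters stand in for Chinese characters.
UNIGRAM = {"AB": 0.02, "C": 0.10, "A": 0.05, "BC": 0.01}


def unigram_log_prob(words, unigram, floor=1e-9):
    """Log of the product of the unigram probabilities of the words in a segmentation."""
    return sum(math.log(unigram.get(w, floor)) for w in words)


def pick_segmentation(seg_f, seg_b, unigram):
    """Return the segmentation with the higher unigram-product score (ties go to seg_f)."""
    if unigram_log_prob(seg_f, unigram) >= unigram_log_prob(seg_b, unigram):
        return seg_f
    return seg_b


# FMM candidate: 0.02 * 0.10 = 0.002; BMM candidate: 0.05 * 0.01 = 0.0005.
print(pick_segmentation(["AB", "C"], ["A", "BC"], UNIGRAM))  # ['AB', 'C']
```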

As per claim 31, Chen, Brockett, and Pedersen teach all the limitations of claim 30, upon 
which claim 31 depends. Pedersen, furthermore, teaches a joint probability of a set of context 
features conditioned on an existence of one of the segmentations of each overlapping ambiguity 
string (ambiguous word) as a function of a corresponding probability of a leftmost and a 
rightmost word of the corresponding overlapping ambiguity string (pages 63-64, 2nd paragraph, 
Naive Bayesian Classifiers). 

Conclusion 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Abdelali Serrou whose telephone number is 571-272-7638. The 
examiner can normally be reached on 8:30-5:00. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, David R. Hudspeth, can be reached on 571-272-7843. The fax phone number for the 
organization where this application or proceeding is assigned is 571-273-8300. 

Information regarding the status of an application may be obtained from the Patent 
Application Information Retrieval (PAIR) system. Status information for published applications 
may be obtained from either Private PAIR or Public PAIR. Status information for unpublished 
applications is available through Private PAIR only. For more information about the PAIR 
system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR 
system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would 
like assistance from a USPTO Customer Service Representative or access to the automated 
information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 

/Abdelali Serrou/ 5/9/08 
Examiner, Art Unit 2626 

/David R Hudspeth/ 

Supervisory Patent Examiner, Art Unit 2626 



